Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm
Abstract
This paper proposes a straightforward and efficient way to discover clustering tendencies in data using the recently introduced Maximum Variance Clustering algorithm. The approach retains the advantages that the plain clustering algorithm has over other clustering approaches. Experiments on both synthetic and real data evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. The results indicate that the proposal is an efficient and accurate alternative.
Similar resources
A Maximum Variance Cluster Algorithm
We present a partitional cluster algorithm that minimizes the sum-of-squared-error criterion while imposing a hard constraint on the cluster variance. Conceptually, hypothesized clusters act in parallel and cooperate with their neighboring clusters in order to minimize the criterion and to satisfy the variance constraint. In order to enable the demarcation of the cluster neighborhood without cr...
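The two ingredients named in this abstract, the sum-of-squared-error criterion and the hard constraint on cluster variance, can be illustrated with a minimal sketch. It only evaluates a given partition; it does not perform the search the algorithm itself carries out. The function names are hypothetical, and taking "cluster variance" to mean the mean squared distance of a cluster's points to its centroid is an assumption, not something the abstract states.

```python
from collections import defaultdict


def cluster_stats(points, labels):
    """Return {label: (size, sum of squared distances to the cluster centroid)}."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    stats = {}
    for label, members in groups.items():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        squared = sum(
            sum((m[d] - centroid[d]) ** 2 for d in range(dim)) for m in members
        )
        stats[label] = (len(members), squared)
    return stats


def sum_squared_error(points, labels):
    """Sum-of-squared-error criterion of a partition."""
    return sum(squared for _, squared in cluster_stats(points, labels).values())


def satisfies_variance_constraint(points, labels, max_variance):
    """Check that every cluster's variance stays at or below max_variance.

    Assumption: variance = mean squared distance to the centroid.
    """
    return all(
        squared / size <= max_variance
        for size, squared in cluster_stats(points, labels).values()
    )


# Hypothetical toy data: two tight pairs of points far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
two_clusters = [0, 0, 1, 1]
one_cluster = [0, 0, 0, 0]

print(sum_squared_error(points, two_clusters))
print(satisfies_variance_constraint(points, two_clusters, max_variance=1.5))
print(satisfies_variance_constraint(points, one_cluster, max_variance=1.5))
```

On this toy data the two-cluster partition meets a variance bound of 1.5, while merging everything into one cluster violates it, which is the kind of trade-off a variance-constrained method has to resolve.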
Cluster-Based Cumulative Ensembles
In this paper, we propose a cluster-based cumulative representation for cluster ensembles. Cluster labels are mapped to incrementally accumulated clusters, and a matching criterion based on maximum similarity is used. The ensemble method is investigated with bootstrap re-sampling, where the k-means algorithm is used to generate high granularity clusterings. For combining, group average hierarch...
A New Algorithm for Cluster Initialization
Clustering is a well-known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained with this technique depend on the initialization of the cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from a dimension with maximu...
Solving the Capacitated Clustering Problem Using Methods Based on Simulated Annealing and Genetic Algorithms
The Capacitated Clustering Problem (CCP) is a classical location problem with various applications in data mining. In the capacitated clustering problem, a set of n entities is to be partitioned into p disjoint clusters, such that the total dissimilarity within each cluster is minimized subject to constraints on maximum cluster capacity. Dissimilarity of a cluster is the sum of the dissimilar...